FP-Split SPADE: An Algorithm for Finding Sequential Patterns
Abstract
Sequential Pattern Mining (SPM) is one of the key areas in Web Usage Mining (WUM), with broad applications such as analyzing customer behavior from weblog files. Current algorithms in this area fall into two broad categories: apriori-based and pattern-growth-based. Apriori-based algorithms need to scan the database many times because they follow a candidate-generation-and-test approach. Despite extensive research, even the best apriori-based algorithm for SPM in terms of the number of database scans, SPADE, scans the database three times to discover sequential patterns. Pattern-growth-based algorithms avoid the candidate generation step; the best pattern-growth algorithm known so far, PrefixSpan, needs to scan the database at least twice. In this paper, a novel SPM algorithm called FP-Split SPADE is proposed. It reduces the number of database scans to one by building an FP-Split tree and applying the SPADE algorithm to the tree instead of to the sequence database, which greatly improves the efficiency of mining sequential patterns.
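For readers unfamiliar with SPADE, the vertical id-list idea it relies on can be sketched as follows. This is a minimal illustration under stated assumptions, not the FP-Split SPADE method itself: the FP-Split tree is not shown, the toy weblog data is invented, and the names `build_idlists`, `support`, and `temporal_join` are hypothetical. It handles only single-item events joined into two-step patterns.

```python
from collections import defaultdict


def build_idlists(db):
    """Single pass over a horizontal sequence database.

    db maps a sequence id (sid) to a list of (event id, set of items).
    Returns a vertical layout: item -> set of (sid, eid) occurrences.
    """
    idlists = defaultdict(set)
    for sid, events in db.items():
        for eid, items in events:
            for item in items:
                idlists[item].add((sid, eid))
    return idlists


def support(idlist):
    """Support = number of distinct sequences in which the pattern occurs."""
    return len({sid for sid, _ in idlist})


def temporal_join(a, b):
    """Id-list of the pattern 'a followed later by b'.

    Keeps each occurrence of b that comes after the earliest occurrence
    of a in the same sequence. No rescan of the database is needed.
    """
    first_a = {}
    for sid, eid in a:
        first_a[sid] = min(eid, first_a.get(sid, eid))
    return {(sid, eid) for sid, eid in b if sid in first_a and eid > first_a[sid]}


# Toy weblog sessions (hypothetical data): sid -> [(eid, pages visited)]
db = {
    1: [(1, {"home"}), (2, {"cart"}), (3, {"pay"})],
    2: [(1, {"home"}), (2, {"pay"})],
    3: [(1, {"cart"}), (2, {"home"})],
}

idlists = build_idlists(db)
home_then_pay = temporal_join(idlists["home"], idlists["pay"])
print(support(home_then_pay))  # number of sessions containing home -> pay
```

Once the id-lists exist, the support of a longer pattern comes from joining the lists of shorter patterns, which is why vertical methods avoid repeated passes over the original sequences.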
Similar resources
Mining Sequential Patterns in Dense Databases
Sequential pattern mining is an important data mining problem with broad applications, including the analysis of customer purchase patterns, Web access patterns, DNA analysis, and so on. We show that on dense databases, a typical algorithm like SPADE tends to lose its efficiency. SPADE is based on the use of lists containing the locations of the occurrences of a pattern in the sequences ...
GO-SPADE: Mining Sequential Patterns over Datasets with Consecutive Repetitions
Databases of sequences can contain consecutive repetitions of items. This is the case in particular when some items represent discretized quantitative values. We show that on such databases, a typical algorithm like SPADE tends to lose its efficiency. SPADE is based on the use of lists containing the locations of the occurrences of a pattern in the sequences and these lists a...
Survey of Sequential Pattern Mining Algorithms and an Extension to Time Interval Based Mining Algorithm
Sequential pattern mining finds the subsequence and frequent relevant patterns from the given sequences. Sequential pattern mining is used in various domains such as medical treatments, natural disasters, customer shopping sequences, DNA sequences and gene structures. Various sequential pattern mining algorithms such as GSP, SPADE, SPAM and PrefixSpan have been proposed for finding the relevant...
Fast Vertical Mining of Sequential Patterns Using Co-occurrence Information
Sequential pattern mining algorithms using a vertical representation are the most efficient for mining sequential patterns in dense or long sequences, and have excellent overall performance. The vertical representation allows generating patterns and calculating their supports without performing costly database scans. However, a crucial performance bottleneck of vertical algorithms is that they ...
A Survey on Algorithms for Sequential Pattern Mining
Sequential pattern mining is a very useful mining technique for various sectors such as healthcare, retail business, DNA analysis, etc. It generates patterns that occur frequently in a given sequence of transactions. It uses a sequence database holding sequences of transactions with their transaction times. In a sequence database, every transaction contains various items. By sequential pattern mining use...
Publication date: 2016